(57) Abstract: A novel 2S cocoa albumin was isolated, purified and identified from cocoa beans. Enzymatic hydrolysis of the protein
generated a pool of flavour precursors, peptides and amino acids that resulted in formation of cocoa flavour upon heating with sugars.
The DNA encoding a precursor cocoa 2S protein was isolated from immature Theobroma cacao seeds.

A novel cocoa albumin and its use in the production of cocoa and chocolate flavour

The present invention relates to a novel cocoa polypeptide and the DNA encoding it. In
particular, the present invention pertains to the use of said polypeptide and/or fragments
thereof in the production of cocoa/chocolate flavour.

In processing cocoa beans the generation of the typical cocoa flavour requires two steps, the
fermentation step and the roasting step. During fermentation the pulp surrounding the beans
is degraded by micro-organisms, with the sugars contained in the pulp being essentially
transformed to acids. Fermentation also results in a release of peptides of differing
sizes and in the generation of a high level of free hydrophobic amino acids. This latter finding led
to the hypothesis that proteolysis occurring during fermentation is not due to random
protein hydrolysis but rather seems to be based on the activity of specific endoproteinases.
This specific mixture of peptides and hydrophobic amino acids is deemed to represent
cocoa-specific flavour precursors. During the second step of cocoa flavour production, the
roasting step, the oligopeptides and amino acids generated at the stage of fermentation
apparently undergo a Maillard reaction with the reducing sugars present, eventually yielding the
substances responsible for the cocoa flavour as such.

So far, research has tried to uncover the molecular pathway of producing cocoa flavour
precursors by characterizing the enzymes involved in said process and the relevant polypeptide(s)
from which the peptides and/or free amino acids are produced.

As for the enzymes, many different endo- and exoproteinase activities have been found to
participate in the production of cocoa flavour precursors, such as an aspartic endoproteinase
activity (Voigt et al., J. Plant Physiol. 145 (1995), 299-307), which accumulates with the
vicilin-class (7S) globulin during bean ripening, or a cysteine endoproteinase activity, which
increases during the germination process when degradation of globular storage protein
occurs (Biehl et al., Cocoa Research Conference, Salvador, Bahia, Brasil, 17-23 Nov. 1996).
Moreover, a carboxypeptidase activity has been identified which preferentially splits
hydrophobic amino acids from the carboxy-terminus of peptides.

Apart from the enzymes, the protein source of the peptides/amino acids also seems to be of
importance for the generation of cocoa flavour precursors.

During cocoa bean fermentation the percentage reduction of protein concentration observed 
for vicilin and albumin was 88.8% and 47.4%, respectively (Amin et al. J. Sci. Food Agric. 
76 (1998), 123-128). When peptides obtained by proteolysis of the globulin (vicilin)
fraction were post-treated with carboxypeptidase, preferentially hydrophobic amino acids 
were released and a typical cocoa aroma was detected after roasting in the presence of 
reducing sugars (Voigt et al., Food Chem. 50 (1994), 177-184). Contrary to that, the 
predominant amino acids released from the albumin-derived peptides were aspartic acid, 
glutamic acid and asparagine and no cocoa aroma could be detected. It was therefore 
concluded that cocoa-specific aroma precursors are mainly derived from the vicilin-like 
globulins of cocoa, which constitute more than 30 % of the total protein contents in the 
mature cocoa seed. Consequently, the mixture of hydrophobic free amino acids and
remaining oligopeptides required for the generation of the typical cocoa flavour components
seems to be determined by the particular structure of the cocoa vicilin-class globulins.

Although it is known that hydrophobic amino acids are important cocoa flavour precursors, 
the specific peptides responsible for generating cocoa flavour during roasting remain by and 
large uncharacterized. Consequently, there is a need in the art to provide further structural
data on such peptides and on the way they are produced from their original proteinaceous
source, in order to be able to use those peptides for the production of a well-balanced
cocoa and/or chocolate flavour.

In WO 91/19801 two major cocoa seed storage proteins and the DNA sequences encoding
said polypeptides are disclosed. These two proteins exhibit a molecular weight of 47 kDa
and 31 kDa, respectively, and seem to be the vicilin polypeptides, which are deemed to be
the source of the flavouring peptides creating the characteristic cocoa flavour. Though the
polypeptides have been provided recombinantly as such, no specific data as to the synthesis
of the flavouring peptides have been provided.


Consequently, an object of the present invention resides in further elucidating the generation 
of cocoa flavour from the relevant protein material contained in cocoa. 

In order to solve the above problem, research has naturally focused on the vicilin
polypeptides in cocoa beans, since other protein material contained therein was not
considered to contribute to the generation of cocoa flavour as such. In contrast to this general
belief the present inventors have now surprisingly found that a polypeptide, being a member
of the albumin family, also contributes to the characteristic cocoa flavour during
fermentation and roasting.


Thus, the present invention provides a novel polypeptide as identified by SEQ ID NO 1 or
fragments thereof having an N-terminus comprising the amino acid sequences as identified by
SEQ ID NO 2 or 3, and/or heterodimers of said fragments. In a preferred embodiment the
mature polypeptide as identified by SEQ ID NO 4 is provided.


According to another aspect the present invention provides a nucleic acid as identified by
SEQ ID NO 5, or a derivative thereof, encoding any of the above polypeptide(s). The present
nucleic acids also comprise DNA molecules that are derived from the nucleic acid identified
by SEQ ID NO 5 by the degeneracy of the genetic code or by substituting one or more bases,
with the proviso that a polypeptide identified by SEQ ID NO 1 will be obtained. The present
invention also contemplates allelic variations of the nucleic acid indicated.

In the Figures,

Fig. 1 shows an SDS-PAGE analysis of different extracts of cocoa acetone powder;

Fig. 2 shows an SDS-PAGE analysis of the purified 2S albumin;

Fig. 3 shows the predicted α-helical regions and the hydrophobicity plot for the T. cacao 2S
precursor protein;

Fig. 4 is a schematic representation of Brassica 2S protein processing and the predicted
processing sites for the 2S protein of T. cacao and shows the amino acid sequence of the mature
polypeptide (SEQ ID NO 4) according to tryptic peptide mass fingerprints of the purified
protein;

Fig. 5 shows a cocoa flavour evaluation of enzymatically hydrolyzed cocoa polypeptides.

During the studies leading to the present invention the inventors originally tried to find
peptides derived from the vicilin-like globulins present in cocoa. To this end, several
experiments were carried out on cocoa acetone powder, wherein the 21 kDa albumin
polypeptide was selectively removed. After several purification steps a substantially
homogeneous protein preparation was obtained that showed a major band at about 9 kDa and a
weak band at about 4 kDa. The protein thus obtained was sequenced and two amino acid
sequences were obtained:

RPVSK HLDSC CQQLE KLDTP PRRPG LKQAV QQCA (SEQ ID NO 2)

and

SKEXS CKXI (SEQ ID NO 3)

On the basis of this information a cDNA library prepared from T. cacao was screened for
nucleic acids encoding such (a) protein(s) and a polypeptide with a theoretical molecular
weight of about 17 kDa could be located. The nucleic acid sequence and the deduced amino acid
sequence are shown in SEQ ID NOs 5 and 1, respectively.

As may be seen from a comparison of the amino acid sequences obtained from sequencing
the purified protein and the amino acid sequence (open reading frame) derived from the
nucleic acid, both of the sequences are contained in the open reading frame of the subject
nucleic acid molecule, indicating a post-translational processing of a precursor molecule as
represented by the 17 kDa polypeptide.
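
The position of the longer N-terminal read within the open reading frame can be checked with a simple ungapped search. The following is a minimal sketch, not part of the original disclosure: the function name best_match is hypothetical, the precursor string is the one-letter form of SEQ ID NO 1, and mismatches are tolerated on the assumption that a chemically sequenced read need not agree with the cDNA-derived sequence at every residue.

```python
# One-letter form of SEQ ID NO 1 (precursor, 150 residues), 16 residues per chunk.
PRECURSOR = (
    "MAKLGLLLATLALVLF" "LGNASVYHTTVTVDSE" "ENPWGSKESSCQKQIK"
    "KQNYLKHCQEYMEEQS" "RGSGSSSSRERYSRPV" "SKHLDSCCQQLEKLDT"
    "PCRCPGLKQAVQQQAE" "EGEFGREELQEMYETV" "DKIMNKCDVEPGRCNL"
    "QPRNWF"
)

# One-letter form of SEQ ID NO 2 (N-terminal read of the purified protein).
EDMAN_READ = "RPVSKHLDSCCQQLEKLDTPPRRPGLKQAVQQCA"


def best_match(read, target):
    """Return (1-based start, mismatch count) of the best ungapped placement.

    Hypothetical helper: 'X' in the read is treated as a wildcard.
    """
    best = None
    for start in range(len(target) - len(read) + 1):
        window = target[start:start + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b and a != "X")
        if best is None or mismatches < best[1]:
            best = (start + 1, mismatches)
    return best


start, mismatches = best_match(EDMAN_READ, PRECURSOR)
print(f"best placement starts at residue {start} with {mismatches} mismatches")
```

Under these assumptions the read is placed at residue 78 of the precursor, which agrees with the start of the mature polypeptide of SEQ ID NO 4.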

As is known from other species, e.g. rape seed, a precursor polypeptide of the 2S protein is
subjected to different post-translational processing steps including the generation of two
subunits that are held together by intra- and inter-chain disulfide bonds. These two peptides
are produced by removal of peptides at the N-terminus, between the subunits and at the
C-terminus of the precursor molecule. On the basis of the information provided a similar
mechanism seems to take place with the 2S polypeptide of T. cacao, as evidenced by the
occurrence of two different N-terminal sequences.



Consequently, according to a preferred embodiment the present invention provides a
polypeptide which is derived from the 17 kDa polypeptide and which has an N-terminus
comprising the amino acid sequence as identified under SEQ ID NO 2. This part represents a
subunit of the mature 2S polypeptide.

According to yet another preferred embodiment the present invention relates to a
polypeptide derived from the 17 kDa polypeptide as described herein, and which has an
N-terminus comprising the amino acid sequence as identified under SEQ ID NO 3. This part of
the 17 kDa polypeptide represents another subunit of the mature 2S protein.

Further, it could now be shown that the present 2S protein of T. cacao also yields peptides
which upon reaction with reducing sugars result in cocoa flavour products (see below).
Therefore, the present 17 kDa polypeptide or a fragment thereof, preferably fragments
comprising the amino acid sequence of the subunits of the mature 2S polypeptide, may be
recombinantly produced to obtain cocoa flavour precursors which may be used for
producing cocoa flavour.

For expression, a nucleic acid coding for any of the polypeptides of the present invention
may be incorporated in a suitable vector, with which a cell of interest is transformed. Since
the polypeptide does not seem to be glycosylated, expression in prokaryotic cells is also
possible. To this end, the nucleic acid as identified by SEQ ID NO 5 or a fragment thereof
may be incorporated in an expression vector, such as pUC, pNZ124 (Platteuw et al., (1994)
Appl. Env. Microbiol. 60, 587), pGK12 (Walke et al., (1996) FEMS Microbiol. 138, 233) or
pG+host9 (Maguin et al., (1996) J. Bacteriol. 178, 931) etc. For expression in e.g. the
methylotrophic yeast Pichia pastoris, the vector pPICZαA as described in the manual of the
EasySelect™ Pichia Expression kit, version B (Invitrogen, The Netherlands) can be used.
Heterologous expression in Yarrowia lipolytica can be obtained with the vector pINA1294
containing a defective ura3 gene (ura3d4) that allows direct selection for multicopy
integrants (Madzak, C., Treton, B. and Blanchin-Roland, S. (2000) Strong hybrid promoters
and integrative expression/selection vectors for quasi-constitutive expression of
heterologous proteins in the yeast Yarrowia lipolytica. J. Microbiol. Biotechnol. 2(2): 207-216).
For Hansenula polymorpha, B14-derived expression vectors containing the FMD
promoter and MOX terminator may be used, as described in Mayer, A.F., Hellmuth, K., Schlieker, H.,
Lopez-Ulibarri, R., Oertel, S., Dahlems, U., Strasser, A.W.M., van Loon, A.P.G.M. (1999)
Biotechnol. Bioeng. 63:373. It will be appreciated that the skilled person is well aware of how to
arrange the corresponding nucleic acid such that an open reading frame is present, such as
is e.g. necessary for producing the fragments of the 17 kDa precursor. To this end, a start
codon may be positioned directly in front of the respective N-terminus of a fragment or may
be positioned such that it is spaced from the polypeptide to be expressed by a linker which
may support the isolation of the resulting polypeptide. Methods for introducing a nucleic
acid into a vector and for transforming cells with the vector are known to the skilled person
and may be found in "Maniatis and Sambrook, A Laboratory Manual, Cold Spring Harbor
(1992), USA".

The cell of interest may be any cell or cell line with which the present polypeptide or a
fragment thereof may be expressed. Expression in E. coli may be advantageous due to its
easy handling and the option to choose from a variety of different expression vectors for
expressing polypeptides within the cell or in secreted form. However, the nucleic acids of
the present invention may well be incorporated in cells of higher origin, such as plant cells,
in particular in cells of T. cacao. In this respect over-expression of the 2S polypeptide may
be achieved in a recombinant T. cacao plant by incorporating a nucleic acid as identified by
SEQ ID NO 5 into a cocoa cell using vectors suitable for plants, such as the Ti-plasmid, or
using the technique of homologous recombination. Such a plant will eventually yield a
higher content of cocoa flavour precursors.

In the case of producing the polypeptides of the present invention by recombinant means in
bacteria, yeast or in cell culture in general, the polypeptide may be isolated by methods
known per se and the purified 2S polypeptide may be subjected to a proteolytic degradation
using the different enzymes known to participate in the generation of cocoa flavour. In a
subsequent step the flavour precursors thus obtained may be contacted with sugars such that
a Maillard reaction may take place, eventually yielding cocoa/chocolate flavour.

However, substances yielding cocoa flavour are also known to beneficially affect
physiological and/or medical conditions, and may thus be used in the treatment of
hypertension or mood depression. Immune modulatory activities are also known, such as
improving an individual's capability to cope with bacterial challenges. The present invention
therefore also envisages such usages.

The following examples illustrate the invention in a more detailed manner. It is, however,
understood that the present invention is not limited to the examples but is rather embraced by
the scope of the appended claims.

Example 1 

Identification of a cocoa 2S albumin 

Cocoa pods were obtained from experimental farms in Ecuador, Ivory Coast and Malaysia
and unless stated otherwise all studies were carried out using West African Amelonado
cocoa beans. Due to the high fat and polyphenol contents, proteins were extracted from
Cocoa Acetone Powder (CAP). The CAP was prepared from non-defatted cocoa beans as
follows: Sun-dried unfermented cocoa beans were passed through a bean crusher followed
by a winnower to remove shells. The cocoa nibs were milled and the nib powder was passed
through a 0.8-mm sieve. The cocoa nib powder was suspended in 80 % (v/v) aqueous acetone,
stirred and subsequently centrifuged. The residue was extracted 5 times with 80 % (v/v)
aqueous acetone and 3 times with 100 % acetone. The acetone powder was dried under
reduced pressure.

CAP (5 g) was suspended in 50 ml ice-cold sodium acetate buffer (50 mM, pH 4.0)
containing 0.1 mM pepstatin. The suspension was sonicated for 2 x 30 sec with a 10 min
layover interval on ice. The suspension was centrifuged at 20,000 g for 15 min at 4 °C. The
residue was extracted twice with buffer pH 4.0 (supra) followed by water extraction
employing sonication. The residue from the water extract was finally extracted with 100 mM
Tris-hydrochloride, pH 8.5. The supernatant was passed through a sterile 0.22 µm filter and
stored at -20 °C.

SDS-PAGE analysis of pH 8.5 extracts following two exhaustive washes of the residue
with pH 4 buffer followed by a water wash showed complete absence of the high intensity
21 kDa protein (Fig. 1). Separation was carried out in a conventional manner (Laemmli
(1970)), employing 12.5 % gels. Lane A contains low range molecular markers (Bio-Rad),
lane B contains total CAP extract with 1 % SDS, lane C contains CAP extracted 2 times with
100 mM acetate buffer, pH 4.0 (supra), lane D contains a subsequent water extract of the
residue of lane C and lane E contains an extract of the residue of lane D with 100 mM
Tris-hydrochloride buffer, pH 8.5.

It could be shown that the 21 kDa protein could be essentially removed. Further under these 
conditions a protein having a molecular weight of about 9 kDa was detected which was 
further purified. 


Chromatographic steps were performed at room temperature using a BioCad20
chromatography station (Perseptive Biosystems). The frozen CAP extract was thawed
overnight at 4 °C and, if necessary, adjusted to pH 8.5 with 1 M Tris-chloride buffer, pH 8.5.
The clear CAP extract was applied to a Resource Q column (Amersham-Pharmacia Biotech)
equilibrated with buffer A (50 mM Tris-bis-propane chloride, pH 8.5) at a flow rate of 5
ml/min. The column was washed with buffer A until A280 decreased to below 0.05. The
column was eluted with a linear gradient (20 column volumes) of NaCl 0-500 mM in
buffer A. Fractions were analysed by SDS-PAGE, and those showing the 9 kDa protein were
pooled and concentrated by ultra-filtration (PM-10 membrane, Amicon). The concentrated
9 kDa protein fraction was injected onto a HiLoad Superdex 30 column (26 x 600 mm)
equilibrated with 50 mM sodium phosphate buffer, pH 7, containing 100 mM NaCl. The
column was eluted with the same buffer and fractions were collected. The fractions showing
the 9 kDa protein were pooled and concentrated by ultra-filtration. The purified protein
solution was passed through a fast-desalting PD-10 column (Amersham-Pharmacia Biotech)
for buffer exchange to water, sterile filtered, and stored at -20 °C.
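
The linear salt gradient of the anion exchange step can be expressed as a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the column volume passed to minutes_for_gradient is an assumed value, since the text does not state the size of the Resource Q column.

```python
def nacl_mM(cv_elapsed, gradient_cv=20.0, start_mM=0.0, end_mM=500.0):
    """NaCl concentration (mM) after cv_elapsed column volumes of a linear gradient."""
    fraction = min(max(cv_elapsed / gradient_cv, 0.0), 1.0)  # clamp to the gradient range
    return start_mM + fraction * (end_mM - start_mM)


def minutes_for_gradient(column_volume_ml, gradient_cv=20.0, flow_ml_min=5.0):
    """Run time of the gradient at the stated flow rate of 5 ml/min."""
    return gradient_cv * column_volume_ml / flow_ml_min


# Example with an ASSUMED 6 ml column (the column volume is not given in the text).
print(nacl_mM(10.0))              # concentration halfway through the gradient
print(minutes_for_gradient(6.0))  # gradient duration in minutes
```

With a 20 column volume gradient from 0 to 500 mM, each column volume raises the NaCl concentration by 25 mM, which helps relate a fraction number to the salt concentration at which the 9 kDa protein elutes.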


Two successive chromatography steps, anion exchange and gel filtration, resulted in apparent
homogeneity of the protein preparation as judged by SDS-PAGE followed by Coomassie
Brilliant Blue staining (Fig. 2). 10 µl samples were diluted 3 x with sample buffer in the
presence and absence of β-mercaptoethanol, centrifuged and electrophoresed at 100 V.
The gels were stained for small peptides. Lane A contains a molecular weight marker, lane B
contains a CAP extract (1 % SDS/50 mM phosphate buffer, pH 7.0) following exhaustive
extraction at pH 4, lane C contains purified 2S albumin in the absence of β-mercaptoethanol, and lane
D contains purified 2S albumin under reducing conditions.

Glycosylation was assessed employing the glycoprotein detection kit from Bio-Rad. The
purified cocoa albumin was not found to be glycosylated.



The polypeptide was subjected to Edman degradation, which yielded two amino acid sequences,
i.e. RPVSK HLDSC CQQLE KLDTP PRRPG LKQAV QQCA and SKEXS CKXI.



Example 2 

Cloning of the 2S albumin gene

Total RNA was isolated from mature and less mature seeds according to methods known per
se (Maniatis, supra). Poly A+ RNA was prepared from the total cocoa seed RNA using the
Oligotex kit from Qiagen following the kit instructions for 250-500 µg total RNA. The final
pellet was resuspended in 10 µL of RNase-free water, and the concentration of RNA present
was estimated to be approximately 5-10 ng/µL using Clontech nucleic acid "Quick Sticks".

The synthesis of cDNA from the poly A+ mRNA was carried out using a SMART PCR
cDNA synthesis kit from Clontech. The method used was as described in the kit instructions.
For the first strand cDNA synthesis step, 4 µL (20-40 ng) of poly A+ mRNA and
200 units of Gibco BRL Superscript II MMLV reverse transcriptase were used. The PCR step of the
SMART protocol was also set up as directed in the kit instructions, except that only 2 µL of the
first strand reaction was added. First, 18 cycles of PCR were run; then 35 µL was taken
out of the total reaction (100 µL) and this part of the reaction was run for a further 5 cycles
of PCR.

A pool of two PCR reactions was then prepared: 40 µL of the 18 cycle PCR reaction and
15 µL of the 23 cycle PCR reaction. 2.5 µL proteinase K (Boehringer Mannheim, nuclease
free, 14 µg/µL) was added to this cDNA mixture and the reaction was carried out at 45 °C
for one hour. After a brief spin, the reaction was stopped by heating the mixture to 90 °C for
8 min. The mix was then chilled on ice, 5 µL of T4 DNA polymerase (New England
Biolabs, 3 units/µL) was added, and the reaction was incubated at 14-16 °C for 30 min.
Then, 25 µL of Milli Q water, 25 µL phenol (Aqua phenol), and 25 µL chloroform/isoamyl
alcohol (Ready Red) were added. This mixture was vortexed, spun, and the top aqueous layer
was taken. The phenol layer was re-extracted with 50 µL of water. The two resulting
aqueous layers were then pooled and re-extracted with chloroform/isoamyl alcohol (Ready
Red). The DNA in the aqueous layer obtained was ethanol precipitated as described above.
The dried DNA obtained was resuspended in TE buffer (10 mM Tris-HCl pH 8, 1 mM
EDTA) and its concentration was calculated to be approximately 75 ng/µL using nucleic
acid "Quick Sticks" strips from Clontech.

The cDNA was then ligated into the pPCR-Script Amp SK(+) cloning vector of Stratagene.
Two µL of the ligated DNA was transformed into Stratagene Ultracompetent cells XL-2
Blue as described in the instruction manual for these cells.

Eighteen randomly chosen insert-containing clones of the cDNA library were subjected to a
single DNA sequencing run using the T3 primer present in the pPCR-Script Amp vector.
Potential protein coding sequences of these DNA sequences were identified using the
"Lasergene" suite of DNA analysis programs from DNASTAR Inc. The amino acid
sequences obtained for open reading frames were then compared to the sequences obtained
in the Edman degradation (Example 1, above). Of the 18 clones analyzed, 3 clones were
found to contain the same cDNA sequence encoding a protein harboring the amino acid
sequences as identified for the polypeptide searched for.

The DNA insert is 718 base pairs in length and an analysis of the protein encoded by this
cDNA shows that the 2S protein is probably produced first as a precursor having 150 amino
acids with a calculated molecular weight of 17,125 Da and a pI of 6.15. The amino acid
composition profile for the precursor 2S protein shows that the cocoa 2S protein has a
relatively high level of sulfur-containing amino acids.
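
The figures quoted above can be reproduced from the cDNA of SEQ ID NO 5 with a few lines of code. The sketch below is an illustration rather than the analysis actually performed (the text names the "Lasergene" suite): it assumes that translation starts at the first ATG, uses the standard genetic code, and uses standard average residue masses that are not taken from the source.

```python
# cDNA of SEQ ID NO 5, written in blocks of ten bases as in the sequence listing.
CDNA = "".join("""
aagcagtggt aacaacgcag agtacgcggg gaagaaccaa agccttgtca tctaactagc
tatatatcta tatccaccat ggcaaagctc ggtctcctcc tagccaccct tgctcttgtt
ctcttcctcg gcaatgcctc cgtttaccac accaccgtca cggttgacag cgaggaaaac
ccttggggaa gcaaagagag cagctgtcag aagcagataa agaagcaaaa ctacctcaag
cactgtcagg agtacatgga ggagcagtcc agaggcagcg gcagcagcag cagccgtgag
cgctacagcc gccccgtgag caagcaccta gactcctgtt gccagcaact ggagaagctc
gatacgccgt gccgttgccc tggtctaaaa caggcagtgc agcaacaggc ggaagaggga
gagtttggga gggaagagtt gcaagagatg tatgagacgg ttgacaagat catgaacaag
tgtgacgtag agcctggaag gtgtaacttg caacctcgca actggttcta gagagaaaga
aagatcagag ctgcctgatc taatgtaaaa caatgactgt aatgtttcac ccatcaactc
tggtgttcta actggaggtt tttggggtga ctagaagtac tgataatcca taaataaaag
cacattctcg tgtgcggttg ctttttgctt caggccaaaa aaaaaaaaaa aaaaaaaa
""".split())

# Standard genetic code, built from the conventional t/c/a/g ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {a + b + c: AMINO[16 * i + 4 * j + k]
          for i, a in enumerate(BASES)
          for j, b in enumerate(BASES)
          for k, c in enumerate(BASES)}

# Standard average residue masses in Da (assumption: not given in the source).
AVERAGE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}


def translate_first_orf(dna):
    """Translate from the first ATG to the first in-frame stop codon."""
    start = dna.find("atg")
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        residue = CODONS[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return start + 1, "".join(protein)  # 1-based start position, protein sequence


def average_mass(protein):
    """Average molecular mass: residue masses plus one water."""
    return sum(AVERAGE_MASS[r] for r in protein) + 18.015


orf_start, precursor = translate_first_orf(CDNA)
sulfur = precursor.count("C") + precursor.count("M")
print(len(CDNA), orf_start, len(precursor), round(average_mass(precursor)), sulfur)
```

Counting cysteine and methionine together gives one simple measure of the sulfur-containing amino acid content mentioned in the text.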


Example 3 

Biochemical characterisation of the 2S-protein 

LC/ESI-MS Analysis Data of a cocoa 2S albumin:
LC/ESI-MS analysis showed the molecular weight of the mature protein to be 8513±2 Da
(Fig. 5). Reduction and S-pyridinylethylation resulted in a positive shift of 630 mass units
(Mr 9,145), indicating the presence of 6 cysteine residues.
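
The cysteine count follows from dividing the observed mass shift by the nominal shift of 105 mass units per modified cysteine stated in Example 3. A minimal sketch, in which the function name is hypothetical and the mature sequence is the one-letter form of SEQ ID NO 4:

```python
# One-letter form of SEQ ID NO 4 (mature polypeptide, 73 residues).
MATURE = (
    "RPVSKHLDSCCQQLEKLDTPCRCPGLKQAVQQQAEEGEFGREELQEMYET"
    "VDKIMNKCDVEPGRCNLQPRNWF"
)


def cysteines_from_shift(native_mass, alkylated_mass, shift_per_cys=105.0):
    """Estimate the number of modified cysteines from the total mass shift."""
    return round((alkylated_mass - native_mass) / shift_per_cys)


n_cys = cysteines_from_shift(8513, 9145)
print(n_cys, MATURE.count("C"))
```

The estimate from the mass shift can then be compared with the number of cysteines in the mature sequence.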

Tryptic peptide mass fingerprinting:
The primary structure of the mature cocoa 2S albumin having a molecular mass of 8513±2
Da was determined by generating the tryptic peptide mass fingerprints of the reduced and
pyridinylethylated albumin by RP-HPLC/ESI-MS. A total of 10 peptide masses were
detected (Table 1).

Table 1

Tryptic Peptide Analysis of Cocoa 2S Albumin by LC/ESI-MS

Theoretical average    Sequence    Tryptic     Average observed
[M+H]+                 position    peptides    [M+H]+

3829.042               4-39        ND
1576.735               105-118     T8          1576.7
1513.673               119-130     T7          1513.6
1439.568               55-65       ND
1303.577               83-93       T9          1513.5
 836.462               76-82       ND
 775.340               135-141     T3          880.4
 730.366               142-147     T5
 724.322               66-73       ND          809.4
 704.340               94-99       T6
 681.287               40-45       ND
 665.362               50-54       ND          622.3
 517.280               100-104     T4          505.3
 505.280               131-134     T2
 466.208               148-150     ND
 388.255               46-48       ND
 349.190               1-3         ND
 304.162               74-75       T1          586.4
 147.113               49-49       T10         1536.5/1447.9

ND, not detected

The mature 2S proteins of plants such as Brassica napus (rape seed) and pumpkin (Hara-Nishimura
et al., "Proglobulin processing enzyme in vacuoles isolated from developing
pumpkin cotyledons", Plant Physiol. 85 (1987) 440-445) are known to be post-translationally
processed to generate two subunits. A comparison of the observed tryptic peptide masses of
the mature protein against the translated amino acid sequence showed a 100 % amino acid
sequence match to residues 78 to 147 (SEQ ID NO 1). The peptide fragments containing
cysteine residues showed the expected positive mass shift of 105 per cysteine due to
S-pyridinylethylation. Every identified peptide mass was subjected to MS/MS analysis to determine
either a complete or partial amino acid sequence to confirm its mapping to the amino acid
sequence of the albumin. The C-terminal peptide NWF could not be detected. N-terminal
peptides (sequence residues 1-77) could not be detected either, indicating that the 2S cocoa
albumin is post-translationally processed to yield a much smaller polypeptide from its
N-terminal end.
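
The theoretical values of Table 1 can be approximated by an in-silico tryptic digest of the precursor. The sketch below rests on stated assumptions: cleavage after Lys or Arg except before Pro, monoisotopic residue masses from standard tables (not from the source), and the nominal shift of 105 per cysteine for S-pyridinylethylation. The function names are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values, an assumption of this sketch).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728

# One-letter form of SEQ ID NO 1 (precursor, 150 residues).
PRECURSOR = (
    "MAKLGLLLATLALVLF" "LGNASVYHTTVTVDSE" "ENPWGSKESSCQKQIK"
    "KQNYLKHCQEYMEEQS" "RGSGSSSSRERYSRPV" "SKHLDSCCQQLEKLDT"
    "PCRCPGLKQAVQQQAE" "EGEFGREELQEMYETV" "DKIMNKCDVEPGRCNL"
    "QPRNWF"
)


def tryptic_digest(seq):
    """Return (start, end, peptide) tuples, 1-based, cleaving after K/R but not before P."""
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        is_last = i == len(seq) - 1
        if is_last or (residue in "KR" and seq[i + 1] != "P"):
            peptides.append((start + 1, i + 1, seq[start:i + 1]))
            start = i + 1
    return peptides


def mh_plus(peptide):
    """Singly protonated monoisotopic mass [M+H]+ of an unmodified peptide."""
    return sum(MONO[r] for r in peptide) + WATER + PROTON


peptides = tryptic_digest(PRECURSOR)
for first, last, pep in peptides:
    alkylated = mh_plus(pep) + 105 * pep.count("C")  # nominal shift per cysteine
    print(f"{first:>3}-{last:<3}  {pep:<36}  {mh_plus(pep):9.3f}  {alkylated:9.3f}")
```

For example, the peptide at positions 83-93 contains two cysteines, so its modified mass is expected about 210 mass units above the unmodified value of 1303.577, close to the 1513.5 observed for T9.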

Hydrophobicity:

Analysis of a hydrophobicity plot for the cocoa 2S precursor protein (SEQ ID NO 1)
clearly indicates that the N-terminal region of this 2S protein contains a distinct short
hydrophobic region that is considered to represent the signal peptide sequence. The
predicted α-helical regions for the T. cacao 2S precursor show that the position of the
N-terminal residue of the large cocoa 2S fragment mapped by N-terminal sequencing (position 77
in SEQ ID NO 1) has a noticeable absence of α-helix-forming sequences.
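
A hydrophobicity profile of this kind can be sketched with a sliding-window average. The text does not state which scale or window was used for Fig. 3, so the Kyte-Doolittle scale and the window of 9 residues below are assumptions, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the source does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# One-letter form of SEQ ID NO 1 (precursor, 150 residues).
PRECURSOR = (
    "MAKLGLLLATLALVLF" "LGNASVYHTTVTVDSE" "ENPWGSKESSCQKQIK"
    "KQNYLKHCQEYMEEQS" "RGSGSSSSRERYSRPV" "SKHLDSCCQQLEKLDT"
    "PCRCPGLKQAVQQQAE" "EGEFGREELQEMYETV" "DKIMNKCDVEPGRCNL"
    "QPRNWF"
)


def window_means(seq, window=9):
    """Mean hydropathy of every window of the given length along the sequence."""
    return [
        sum(KYTE_DOOLITTLE[r] for r in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


means = window_means(PRECURSOR)
best = max(range(len(means)), key=means.__getitem__)
print(f"most hydrophobic window starts at residue {best + 1}, mean {means[best]:.2f}")
print(f"highest window mean from residue 78 onwards: {max(means[77:]):.2f}")
```

Under these assumptions the most hydrophobic stretch lies within the first residues of the precursor, in line with the signal peptide interpretation given above.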

Example 4 

Flavour Potential of the 2S polypeptide 

Isolated cocoa polypeptide fractions (lyophilized powder), namely 21 kDa albumin (SA),
8.5 kDa albumin (2SA), insoluble vicilin protein fraction (InsV) and total polypeptide
fraction (CPF), were suspended in 100 mM acetate buffer, pH 5, and digested with 1 % (w/w
total protein) Flavourzyme for 16-24 h. Alternatively, the polypeptides were digested in
100 mM acetic acid, pH 3, with 1 % (w/w protein) porcine pepsin for 16-24 h. Both samples
were freeze dried. A subset (at least 70 %) of the pepsin hydrolyzed sample was further digested
with 200 units of carboxypeptidase A. Following analysis (free and total amino
groups and amino acids), an identical amount of each hydrolysate was reacted with reducing
sugars as described in the following section.

The process reaction flavours using amino acids or protein hydrolysates were
prepared as follows: The reference model reaction was prepared by reacting 0.8 % Leu,
1.45 % Phe, 0.8 % Val, 1.5 % fructose, 1.5 % water (4 drops of 50 % (w/v) NaOH in 20 ml
water) and 94 % propylene glycol at 125 °C (temperature of oil bath) for 60 min under
reflux. The cocoa protein hydrolysate-based reaction flavours were prepared by replacing the
amino acids with 1 % (w/w) of lyophilized hydrolysate. At the end of the reaction, each
mixture was cooled to room temperature, and its final pH as well as its optical density at
420 nm was measured. The reaction mixtures were transferred to a dark-brown bottle and stored at
15 °C until sensory profiling.
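
For bench work the reference model reaction can be scaled to a given batch size. The sketch below assumes that the percentages are weight percentages of the total mixture, which the text states only for the hydrolysate. Because the listed values add up to 100.05 %, the hypothetical helper batch_grams normalises them to the batch mass.

```python
# Composition of the reference model reaction as listed in the text (percent).
REFERENCE_PCT = {
    "Leu": 0.8,
    "Phe": 1.45,
    "Val": 0.8,
    "fructose": 1.5,
    "alkaline water": 1.5,
    "propylene glycol": 94.0,
}


def batch_grams(recipe_pct, batch_g):
    """Scale a percentage recipe to grams, normalising to the sum of the listed parts."""
    total = sum(recipe_pct.values())
    return {name: batch_g * pct / total for name, pct in recipe_pct.items()}


# Example: a 50 g batch (the batch size is an assumption, not given in the text).
for name, grams in batch_grams(REFERENCE_PCT, 50.0).items():
    print(f"{name:<17} {grams:6.3f} g")
```

The hydrolysate-based variant can be sketched in the same way by replacing the three amino acid entries with a single entry of 1 % lyophilized hydrolysate.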

A panel of 8 persons was used to evaluate the flavour (aroma and taste) of the process
reaction on a scale of 1-10 for different flavour attributes. Tasting was performed on 0.1 %
(w/w) solutions in 1 % (w/w) sucrose. For each sensory session the average of the score data was
used to evaluate the flavour potential of the various polypeptide fractions.

The results are summarized in Fig. 5, which shows the evaluation of various precursor pools 
generated from the enzymatically hydrolyzed cocoa polypeptide fractions in the flavour 
assay system. As expected the most cocoa flavour is produced by the vicilin storage protein 
fraction. Surprisingly, also the newly identified 2S albumin showed respectable cocoa 
flavour when hydrolyzed by Flavourzyme or pepsin/carboxypeptidase combination. 
Selection of the enzyme cocktail for extensive hydrolysis showed no remarkable difference,
suggesting that cocoa polypeptides harbor innate amino acid sequences for generation of
cocoa flavour. The flavour quality and intensity of the 2S albumin was surprisingly superior to
that of the highly abundant 21 kDa cocoa albumin. These data strongly support the notion that the 2S
polypeptide together with the vicilin storage protein contributes significantly to the
accumulation of the potential cocoa flavour precursors.

Claims



1. A polypeptide as identified by SEQ ID. No. 1.

2. A polypeptide, derived from the polypeptide according to claim 1, having an N-terminus
comprising the sequence as identified by SEQ ID. NO. 2.

3. A polypeptide, derived from the polypeptide according to claim 1, having an N-terminus
comprising the sequence as identified by SEQ ID. NO. 3.

4. A polypeptide being a heterodimer consisting of a polypeptide according to claim 2
and 3.

5. A nucleic acid as identified by SEQ ID. No. 4 or a fragment thereof encoding a
polypeptide according to any of the claims 1 to 3.

6. An expression vector containing one or more of the nucleic acids according to claim 5.

7. A cell containing a recombinant nucleic acid according to claim 6 or a vector
according to claim 6.

8. Use of a polypeptide according to any of the claims 1 to 4, for the preparation of
cocoa/chocolate flavour.

9. Use of a polypeptide according to any of the claims 1 to 4 for the preparation of a
composition for the treatment of hypertension, mood depression, bacterial infections
and a weakened immune condition.

10. A method for producing cocoa flavour comprising hydrolysing a polypeptide according
to any of the claims 1 to 3 by an endopeptidase or in combination with exopeptidases
and subjecting the resulting peptides to a reaction involving reducing sugars.
Fig. 1: SDS-PAGE analysis of different extracts of cocoa acetone powder (CAP) for removal of
the 21 kDa albumin.

Fig. 2: Tricine-SDS-PAGE analysis of the purified 2S albumin.

Fig. 3: Predicted α-helical regions and the hydrophobicity plot for the T. cacao 2S precursor
protein.

 1 RPVSKHLDSC CQQLEKLDTP CRCPGLKQAV QQQAEEGEFG REELQEMYET
51 VDKIMNKCDV EPGRCNLQPR NWF (SEQ ID NO 4)

Fig. 4: Schematic representation of Brassica 2S protein processing (A) and the predicted
processing sites for the 2S protein of T. cacao (B) as well as the amino acid sequence of
the mature polypeptide.

Fig. 5: Cocoa flavour evaluation of enzymatically hydrolyzed cocoa polypeptides.
SEQUENCE LISTING

<110> Societe des Produits Nestle

<120> 2S-polypeptide of cacao

<130> 80272

<140>
<141>

<160> 5

<170> PatentIn Ver. 2.1

<210> 1
<211> 150
<212> PRT
<213> Theobroma cacao

<400> 1
Met Ala Lys Leu Gly Leu Leu Leu Ala Thr Leu Ala Leu Val Leu Phe
  1               5                  10                  15

Leu Gly Asn Ala Ser Val Tyr His Thr Thr Val Thr Val Asp Ser Glu
             20                  25                  30

Glu Asn Pro Trp Gly Ser Lys Glu Ser Ser Cys Gln Lys Gln Ile Lys
         35                  40                  45

Lys Gln Asn Tyr Leu Lys His Cys Gln Glu Tyr Met Glu Glu Gln Ser
     50                  55                  60

Arg Gly Ser Gly Ser Ser Ser Ser Arg Glu Arg Tyr Ser Arg Pro Val
 65                  70                  75                  80

Ser Lys His Leu Asp Ser Cys Cys Gln Gln Leu Glu Lys Leu Asp Thr
                 85                  90                  95

Pro Cys Arg Cys Pro Gly Leu Lys Gln Ala Val Gln Gln Gln Ala Glu
            100                 105                 110

Glu Gly Glu Phe Gly Arg Glu Glu Leu Gln Glu Met Tyr Glu Thr Val
        115                 120                 125

Asp Lys Ile Met Asn Lys Cys Asp Val Glu Pro Gly Arg Cys Asn Leu
    130                 135                 140

Gln Pro Arg Asn Trp Phe
145                 150

<210> 2
<211> 34
<212> PRT
<213> Theobroma cacao

<400> 2
Arg Pro Val Ser Lys His Leu Asp Ser Cys Cys Gln Gln Leu Glu Lys
  1               5                  10                  15

Leu Asp Thr Pro Pro Arg Arg Pro Gly Leu Lys Gln Ala Val Gln Gln
             20                  25                  30

Cys Ala

<210> 3
<211> 9
<212> PRT
<213> Theobroma cacao

<400> 3
Ser Lys Glu Xaa Ser Cys Lys Xaa Ile
  1               5

<210> 4
<211> 73
<212> PRT
<213> Theobroma cacao

<400> 4
Arg Pro Val Ser Lys His Leu Asp Ser Cys Cys Gln Gln Leu Glu Lys
  1               5                  10                  15

Leu Asp Thr Pro Cys Arg Cys Pro Gly Leu Lys Gln Ala Val Gln Gln
             20                  25                  30

Gln Ala Glu Glu Gly Glu Phe Gly Arg Glu Glu Leu Gln Glu Met Tyr
         35                  40                  45

Glu Thr Val Asp Lys Ile Met Asn Lys Cys Asp Val Glu Pro Gly Arg
     50                  55                  60

Cys Asn Leu Gln Pro Arg Asn Trp Phe
 65                  70

<210> 5
<211> 718
<212> DNA
<213> Theobroma cacao

<400> 5
aagcagtggt aacaacgcag agtacgcggg gaagaaccaa agccttgtca tctaactagc 60
tatatatcta tatccaccat ggcaaagctc ggtctcctcc tagccaccct tgctcttgtt 120
ctcttcctcg gcaatgcctc cgtttaccac accaccgtca cggttgacag cgaggaaaac 180
ccttggggaa gcaaagagag cagctgtcag aagcagataa agaagcaaaa ctacctcaag 240
cactgtcagg agtacatgga ggagcagtcc agaggcagcg gcagcagcag cagccgtgag 300
cgctacagcc gccccgtgag caagcaccta gactcctgtt gccagcaact ggagaagctc 360
gatacgccgt gccgttgccc tggtctaaaa caggcagtgc agcaacaggc ggaagaggga 420
gagtttggga gggaagagtt gcaagagatg tatgagacgg ttgacaagat catgaacaag 480
tgtgacgtag agcctggaag gtgtaacttg caacctcgca actggttcta gagagaaaga 540
aagatcagag ctgcctgatc taatgtaaaa caatgactgt aatgtttcac ccatcaactc 600
tggtgttcta actggaggtt tttggggtga ctagaagtac tgataatcca taaataaaag 660
cacattctcg tgtgcggttg ctttttgctt caggccaaaa aaaaaaaaaa aaaaaaaa 718